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The study of the RNA three-dimensional structure has 
received a boost by the recent discovery that some RNA 
molecules act as enzymes in several key cellular processes 
in complete absence of protein cofactors [l| . 

As for proteins, the shape of RNA molecules is strictly 
connected to their function, and thus the study of the 
response to external forces helps to understand how 
biomolecules transform mechanical inputs into chemi- 
cal signals. Recent experiments and simulations have 
shown how it is possible to extract information on the 
RNA structure by using force spectroscopy, where RNA 
molecules are manipulated by using controlled forces. 

In particular, remarkable experimental works 0, 0] 
have investigated the connections between the molecular 
structure of RNA hairpins and Tetrahymena thermophila 
ribozyme, and the respective unfolding pathways under 
mechanical stress. 

Motivated by such experiments, several groups have 
proposed theoretical and numerical approaches to the 
mechanical unfolding of RNA molecules 0, @, 0, 0. 
In particular, by using a coarse-grained Go-model and 
Molecular Dynamics (MD) simulations, Thirumalai and 
coworkers have computed phase diagrams and free en- 
ergy landscapes of RNA hairpins , and the unfolding 
pathways of larger, more complex RNA molecules @,0|. 

However, MD simulations are computationally de- 
manding, and even in the simple case of the determi- 
nation of the phase diagram, simulations have to be 
restarted for every choice of the model parameters. Here 
we introduce a simple discrete model for RNA molecules 
under external force, whose thermodynamics is exactly 
solvable, and as such is able to provide exact thermody- 
namical results for any size of the molecules, in a com- 
putation time which is incomparably smaller than the 
time needed for simulations. We exploit this model to 



obtain the force-temperature (/, T) phase diagram of a 
small and a large RNA molecule and their free energy 
landscape as a function of the molecular elongation. 

Furthermore, by using Monte Carlo simulations (MC), 
we investigate the unfolding pathways of the larger 
molecule, finding that the most probable path from the 
native to the unfolded state agrees with the experimen- 
tally determined one. It is worth noting that the present 
model has been used to evaluate the phase diagram, the 
free energy landscape and the unfolding pathways 
fioj ] of widely studied proteins, showing a good degree of 
agreement with the corresponding experimental results. 

The model - We use a Go model defined by the energy 
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H(m, <t) = -J212 '■> A >> LI m k~fL, (1) 
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where L = V,. , I,,".,*- 1 - mj_i)(l - ™j)IIfc=i TO fc is 
the end-to-end length of the molecule and nik = 0, 1 
is associated to the covalent bond between bases k and 
k + 1. mfc = 1 (respectively 0) means that this bond 
is (resp. is not) in a native-like state. Given the state 
of the m variables, for an RNA molecule with N + 1 
bases, n a (m) = 1 + Ylk=i(^ ~ mfc ) orientational de- 
grees of freedom are introduced. Such degrees of free- 
dom, <Xy , describe the orientation of the n CT (m) native- 
like stretches, relative to the external force /. Indeed, if 
a native-like stretch extends from base i to base j > i, 
then (1 — mi_i)(l — m j')ITfc=i m k = 1- Such a stretch 
can be as short as a single base i — j. We set as bound- 
ary conditions mo = tojv+i = 0. The orientation o~ij of a 
native-like stretch is also a binary variable: <7y- = +1 (re- 
spectively — 1) represents a stretch oriented parallel (resp. 
antiparallel) to the external force /. Uj is the length of 
the i—j native stretch, taken from the Protein Data Base 
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(PDB) and defined as the distance between the phospho- 
rus atoms of bases i and j + 1. Ay is the element of 
the contact matrix, which takes the value 1 if the bases 
i and j + 1 are in contact, and otherwise. Within the 
present model, bases i and j + 1 are considered to be 
in contact if at least two atoms, one from each base, are 
closer than S = 4 A. Finally is the corresponding inter- 
action strength, which is proportional to the number of 
atom pairs, which are in contact according to the above 
criterion. We have shown [9] that, as far as the equi- 
librium thermodynamics is concerned, the sum over the 
a variables can be performed exactly, and in the / = 
case a well-known Ising-like model of protein folding is 



obtained Stacking interactions are implicitly taken 



into account in the present model: one can easily check 
that, for instance, the formation of a contact between 
bases i + 1 and j — l (if present in the native structure) is 
a necessary condition for the formation of an i — j native 
contact. 
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Figure 1: (Color online) Force- Femperature phase diagrams 
of the 1EOR molecule. Upper panel: average order parameter 
(m). Fower panel: average end-to-end length (L). 

Simple hairpin (P5GA) - We first consider a 22- 
nucleotides RNA hairpin, (PDB code 1EOR, see ref.[l2| 
for the secondary structure), which is similar to the P5ab 
in the P5abc domain of group I intron 0. We first focus 
on the equilibrium properties of the molecule, and then 
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Figure 2: Free energy landscape Fq of the 1EOR molecule as 
a function of the molecular elongation L. Inset, tilted FEL 
F Q - fL, with / = 15.4 pN. 



study the unfolding kinetics induced by external forces. 

In fig. [TJ the (/, T) phase diagrams are plotted. They 
show that P5GA behaves like a two-state system, the 
transition region from the folded to the unfolded state 
being quite narrow. It is also in good agreement with 
that by Hyeon and Thirumalai [f| for the same molecule. 

An interesting quantity that characterizes the stability 
of a biopolymer, is the free energy landscape (FEL) Fq 
as a function of its end-to-end elongation L, defined as 



F (L) = -fc B Tln 



Y^e- pHo( - x) 8{L{x) -L) 



(2) 



where x is the microscopic state of the system, and the 
sum is restricted to those states, whose value of the 
macroscopic variable L(x) is equal to the argument L 
of F . Here H Q (x) corresponds to the hamiltonian ([1]) 
with / = 0. As discussed in [§], F Q (L) can be exactly 
computed in the present model. The landscape is plot- 
ted in fig. [21 for T = 300 K. When an external constant 
force / is applied to the molecule free ends, one gets 
the tilted landscape F(L,f) = F (L) — f L, which is 
plotted in fig. [2j for / = 15.4 pN. From the phase dia- 
gram in fig. Q] one sees that for this value of the force, 
the molecule length is about half of its maximum value 
L max ~ 13.8 nm. At the same time, the FEL exhibits 
two wells at small and large elongation, indicating that 
the molecule hops from the folded to the unfolded state, 
for this value of the force [5J1. The FEL of the same 
molecule was obtained in [1,13, S| by using an off-lattice 
coarse grained model and MD simulations. In these pa- 
pers, F(L, f) was estimated by analyzing the kinetics of 
the molecule under force, i.e., by sampling the occupa- 
tion frequency of those states with a given elongation L. 
This method is expected to give reliable results only for 
small molecules, like the one at issue, where the phase 
space of the molecule is sampled according to its equi- 
librium phase space distribution. In the case of larger 
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Figure 3: (Color online) Force- Temperature phase diagrams 
of the T. thermophila ribozyme. Upper panel: average order 
parameter (m) . Lower panel: average end-to-end length (L) . 

molecules, at small forces, one expects that the states 
with large L are not sampled with the correct frequency, 
as computer simulations usually fail to visit rare states. 
On the contrary, the phase diagrams and landscapes, as 
given by the present model, are exact results, and can be 
obtained for any molecule size and any value of / and T. 

Tetrahymena thermophila ribozyme (1GRZ) - In the 
following we investigate the thermodynamical equilib- 
rium properties and the mechanical unfolding of the 
Tetrahymena thermophila ribozyme [HI, PDB code 
1GRZ, whose mechanical unfolding has been studied 
both experimentally 0] and with computational tech- 
niques [6|. We consider the structured part from base 
96 to 331, which exhibits several secondary structure el- 
ements (SSE), named Pn with n=3, . . . , 9, see ref. (l2| . 

In fig. [3]we plot the (/, T) phase diagram of the 1GRZ 
molecule. At variance with the case of the hairpin, this 
larger molecule exhibits a wider transition region, reflect- 
ing the presence of intermediate states along the unfold- 
ing pathway, as will be discussed in detail below. 

In fig. [U we plot the unperturbed FEL Fq as a func- 
tion of the end-to-end length L at T = 300 K. In the 
same figure, we plot the tilted FEL F(L, f) for two val- 
ues of the external force / = 8.82, 17.64 pN. From the 
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Figure 4: Free energy landscape Fa of the 1GRZ molecule as 
a function of the molecular elongation L, for T = 300 K (full 
line). Tilted FEL F(L, /) for two values of the force / = 8.82 
pN (dashed line) and / = 17.64 pN (dotted line, inset). 

phase diagram in fig. [3] one sees that for / = 17.64 and 
T = 300 K, the average end-to-end length is half of its 
maximum value L max ~ 155 nm. The FEL at / = 17.64 
pN exhibits two major and one minor energy wells, in- 
dicating the coexistence of three states characterized by 
three different values of L for this force. 

In order to study the mechanical unfolding of the 
ribozyme, we consider here the experimental protocol 
where the external force is applied by tethering the 
molecule to a colloidal particle trapped in an optical trap: 
fit) = k(X(t)—L(t)), where L(t) is the end-to end length 
of the molecule at time t, while k and X(t) = r ■ t are 
the stiffness and the center of the trap, respectively. This 
experimental setup corresponds to that used in ref. 0], 
which we will use as a reference to compare our results. 

In ref. [1(3], the present model was used to trace the 
state of SSEs of a protein. Similarly, one can moni- 
tor the unfolding of a single SSE of 1GRZ by consid- 
ering a suitable order parameter for each SSE. Here we 
choose the fraction of native contacts within an element. 
The order parameter of the SSE Pn will be defined as 
0p n = Ay ni=i m k/Np a , where the prime means 
that i and j run over those bases belonging to Pn, and 
iVp n = JZjj Ay is the total number of native contacts in 
Pn. This approach allows us to trace the current state of 
each SSE. The unfolding time of a given SSE is defined 
as the time at which the corresponding order parameter 
crosses a given threshold <j) u = 1/3 for the first time [lj| . 

In order to find the typical unfolding pathways, we con- 
sider 1000 trajectories, simulated with a standard Monte 
Carlo algorithm [§], where the trap stiffness and the ve- 
locity take the values k = 13 pN/nm, and r = 0.36 
nm/(MC Step). Time is a discrete variable counting the 
number of MC steps. A typical trajectory is plotted in 
fig. [5] In the force-extension curve unfolding peaks can 
be observed. By comparing the position of the peaks 



4 



25 




-^1 





20 


40 


60 80 
X (nm) 


100 


120 


140 


1 




J ! 








M 


P9 




0.8 












L . 

k 


P378 
P4P6 
P5 




0.6 












'"h 


P5abc 

I 




0.4 














i. 

i 
t 




0.2 














» 


i 

\ :i - 


(b) 


3 


20 


40 


60 80 


100 


120 


140 



X (nm) 

Figure 5: (Color online) Typical unfolding trajectory of the 
1GRZ molecule, (a) force as a function of the position of the 
"optical trap" center X. (b) order parameter of the whole 
system and of the single SSEs as functions of X. 

with the drops in the SSEs order parameters in fig. E^b), 
we can associate peaks in fig. [5fa) to the unfolding of 
SSEs. The equilibrium behaviour of these order param- 
eters is reported and discussed in [l2j]. The unfolding 
pathways corresponding to the 1000 trajectories can be 
easily clustered into two big sets. The first set (622 tra- 
jectories) corresponds to the pathway P9 ^P34678 -^P5 
^P5a -^P5bc. This means that P9 is the first SSE to 
unfold, followed by the SSEs P3, P4, P6, P7 and P8 
with no definite order among them, and so on. This is 
consistent with the experimental pathway Q P9 — >P378 
— >P46 -^P5 -^P5abc, except for lumping together P378 
and P46 and for splitting P5abc. This set of trajectories 
can be analyzed in more detail, looking for finer subdi- 
visions, and one finds that 429 out of these 622 trajec- 
tories correspond to the pathway P9 -^P78 -^P3 -^P46 
— >P5 ^P5a — ^P5bc. Even at this finer level our results 
are consistent with the experimental ones 0] , and we can 
also predict a definite order of unfolding events within the 
domains P378 and P5abc. Moreover, we find an alterna- 
tive, less probable (355 trajectories out of 1000) pathway, 
P9 ->P3 -^P46 -^P5 -^P5a -^P5bc -^P78, where P78 
are the last domains to unfold. The order of the unfold- 



ing events does not appear to be affected by moderate 
variations of the threshold S used to obtain the model 
interaction parameters £y [l2j]. It is worth to note that 
such parameters take implicitly into account the effect of 
counterions on the stability of the native structure, see 
same reference. 

To summarize, the present model turns out to be able 
to provide the equilibrium properties of an RNA hairpin 
with minimal computational efforts in comparison with 
more detailed molecular models. This feature allows us 
to extend our investigation to the equilibrium proper- 
ties of a large molecule, namely the Tetrahymena ther- 
mophila ribozyme. The model is also able to reproduce 
the experimental behaviour of the ribozyme under me- 
chanical loading, providing additional information on the 
unfolding events at a microscopic level not accessible to 
experiments, similarl y to what obtained for the analysis 
of protein unfolding [l(|. These results clearly indicate 
that such a model captures the basic universal features 
underlying the mechanical unfolding of biopolymers. 
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Appendix to "Equilibrium properties and force-driven unfolding path- 
ways of RNA molecules" 



MOLECULAR STRUCTURES 



In this section we report the secondary structures of the RNA fragments we have studied with our model. 
In fig. IA.6I we report the secondary structure of the simple hairpin P5GA (pdb code 1E0R). 

In addition, in fig. IA.7I we report the secondary structure of the main structured portion of the Tetrahymena 
thermophila ribozyme (pdb code 1GRZ). This is the portion we have considered in our work and goes from base 96 
to 331, according to the pdb numbering. In the figure, the numbers which are adjacent to bases correspond to the 
pdb numbering, and the labeling of hairpins follows [13j |. where the secondary structure of the whole molecule can 
be found. It is worth to note, that in that work the authors obtained crystals of the ribozyme by freezing solutions 
containing 50 mM of MgCl2. This concentration is compatible with the experimental conditions of, e.g., Thus, 
the native structure, as given in the protein data base PDB, is affected by the presence of ions Mg 2+ , which stabilize 
the long range, tertiary contacts. As discussed in the text, in generating our model interaction energies e^ , we take 
into account those atom pairs whose distance is smaller than a threshold distance. Thus the effect of the Mg+ ions 
is implicitly taken into account in our model. It is commonly believed [2,3] that tertiary contacts are bottlenecks for 
the molecular unfolding, which manifest as peaks in the force-extension curve, like those in fig. 5a of the main text. 
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Figure A. 6: Secondary structure of 1EOR 



COMPARISON OF THE OUT-OF-EQUILIBRIUM UNFOLDING PATHWAYS WITH THE 

EQUILIBRIUM CONFIGURATIONS 

In fig. IA.8I the order parameter as a function of the molecular elongation L, at T = 300 K. Inspection of this 
curve suggests that at such a temperature, P9 is partially unfolded, and as the molecular elongation increases, the 
unfolding of P3, P4,P6, P7, and P8 start at L = 6 nm, the unfolding of P4 and P9 being slower than the others. At 
L — 65 nm, the P3, P6, P7 and P8 are completely unfolded, while the curves for P4 and P9 exhibits a shoulder. At 
the same value of the elongation, the SSEs P5, P5a, P5b, and P5c start to unfold too, and for L ~ 115 nm only a 
small fraction of P9 and of P5abc is not yet unfolded. Such a figure has to be compared with fig. 4 and fig. 5 in 
the main text. The picture emerging is consistent with the typical unfolding pathways found during the simulations: 
P9 -^P34678 — »P5 -^P5a — +P5bc. It is worth to note, however, that in out-of-equilibrium pulling manipulations the 
unfolding is a stochastic process, and thus the sequence of the unfolding events may vary from one realization of the 
process to another. For example in fig 5.b in the main text one clearly distinguish the unfolding events of P378 and 
P4P6, while in the equilibrium picture, fig. IA.8[ the unfolding of those elements occurs at almost the same length. 
Similarly, the characteristic lengths emerging from analysis of fig. IA.8( i.e. L = 6, 65 nm correspond to minima in 
the energy landscape plotted in the inset of fig. 4 in the main text, suggesting that those minima correspond to the 
typical configurations of the molecule immediately before the unfolding of a specific group of SSEs. 
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Figure A. 7: Secondary structure of 1GRZ 



SUPPLEMENTARY SIMULATIONS 



1EOR 

Here, we describe unfolding simulations of the 1EOR molecule. We consider first the constant force setup (force 
clamp): a constant force / = 15.4 pN is applied to the molecule, and its length is traced as a function of time. A 
typical trajectory is plotted in fig. IA.91 Inspection of this figure clearly indicates that the molecule hops back and 
forth between two states characterized by L ~ 4 nm and L ~ 10 nm, which correspond to the two minima in the 
energy landscape of the molecule, see inset of fig. 2 in the main text. 

We now consider the dynamic-loading set up. As in the main text, the force applied changes as a function of time 
as fit) = k(X(t) — L(t)), where Lit) is the end-to end length of the molecule at time t, while A; and X(t) = r ■ t 
are the stiffness and the center of the external potential, respectively. We take the values k — 13 pN/nm, and r = 1 
nm/(MC step). We find that the molecule unfolds exhibiting a peak in the force-elongation curve, at / ~ 12 pN, in 
very good agreement with ref. 0. 

1GRZ 

As stated in the main text, in the present RNA model, bases i and j + 1 arc considered to be in contact if at least 
two atoms, one from each base, are closer than 8 = 4 A. We now study the possible effects of the threshold 5 on the 
unfolding pathway, and in particular on the force-elongation curve. 

We consider the values 8 — 3.5, 4.5, 8 A, and run unfolding simulations as those described in the main text. Typical 
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Figure A. 8: Equilibrium order parameter <j> as a function of the equilibrium molecular elongation L, at T = 300 K. 
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Figure A. 9: Typical unfolding trajectory of the 1EOR molecule at T = 300 K, under constant force. 



trajectories are plotted in fig. IA.11I inspection of these figures suggests that increasing 5 has no net effect on the 
unfolding pathway, the typical unfolding sequence being that of fig. 5 in the main text. For 6 — 3.5 however, the 
peaks in the force-extension curve appear to be flattened out, as one would expect, because of the reduced number 
and intensity of contact interactions. 
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Figure A. 10: Typical unfolding trajectory of the 1EOR molecule at T — 300 K, under time- varying force. X is the position of 
the center of the external quadratic potential. 
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Figure A.ll: Typical unfolding trajectory of the 1GRZ molecule with modified interaction parameters e^. The parameters 
have been obtained with S = 3.5 (a) , 4.5 (b) and 8 (c) A. 



